Abstract

In a recent paper in this journal [J. Stat. Mech. (2009) P02037] we proposed a new, physically motivated
distribution function for modeling individual incomes, rooted in the framework of κ-generalized
statistical mechanics. The performance of the κ-generalized distribution was checked against real data on
personal income for the United States in 2003. In this paper we extend our previous model so as to be 
able to account for the distribution of wealth. Probabilistic functions and inequality measures of this 
generalized model for wealth distribution are obtained in closed form. In order to check the validity of the 
proposed model, we analyze the U.S. household wealth distributions from 1984 to 2009 and find
excellent agreement with the data, superior to that of any other model known in the literature.
1. Introduction

The quantitative and formal study of the personal (or size) distribution of income and the measurement
of income inequality was first introduced by the Italian economist Vilfredo Pareto. He specified
his type I model in 1895 jij, and his types II and III in 1896 and 1897 0-0), and gave an inequality
interpretation of his shape parameter. Based on Pareto's economic foundations, and on the stochastic
foundations afterward developed by other authors SB, the Pareto law (Pareto type I) is now overwhelmingly
regarded as the income distribution model for high income groups.

After Pareto's seminal contribution, many probability density functions have been proposed in the 
literature that are suitable for describing the size distribution of income amongst the population as a 
whole; see e.g. the comprehensive survey contained in [8]. Fitting of parametric functional forms has
also been common for the distribution of wealth.¹ However, the problem for the wealth researcher is
1 Income and wealth are commonly used to assess the economic well-being of individuals, families or households. Although
some correlation exists between them, the relationship is not perfect: greater income is likely to mean greater wealth, but not 
always. The two measures, in fact, are not synonymous. Income is a flow, since it is meaningful only when defined in relation 
to a period of time (hourly, weekly, monthly or annual income). Wealth is a stock, increasing as new assets are acquired or 
savings accumulated, and the only time information required is when the stock was measured (no periodicity is necessary). 
The link between the flow from income and the stock of wealth is obvious: the greater the former, the more rapidly the latter 
will increase. Accordingly, a high income may be associated with low wealth — this is the case, for example, with young people 
starting their careers; on the other hand, a low income could accompany high wealth — this is the case with some retirees 
who have little income but who have accumulated and paid for substantial assets. At a practical level, wealth is distributed 
much more unequally than income because of life cycle savings and bequest motives 0. Data on stocks of wealth also present 
distinctive features in comparison with income data that make empirical analysis non-standard in several ways (see the ongoing 
discussion above for details). However, as far as the shape of the particular distribution is concerned, income and wealth share 
that virtually all of the models suggested within the context of the income distribution literature are
defined for variables taking only strictly positive values, although published statistical data on wealth
distributions give clear evidence of highly significant frequencies of households or individuals
with null and/or negative wealth. The early contributions systematically dismissed these frequencies and
fitted their respective proposed models to the positive observations only, thus omitting a significant part
of the story.²



To the best of our knowledge, Dagum [la, lla] was the first and only one to specify and test a
four-parameter model for wealth distributions (Dagum type II). The fourth parameter in the Dagum model is
an estimate of the frequency of economic units with wealth equal to zero. This model is highly relevant
for describing the total (gross) wealth distribution because of the consistently large observed percentage
of economic units with null total wealth. Dagum 17-2(3] further developed his type II model to analyze
the distribution of net wealth, which is equal to gross wealth minus total debt. The support of the Dagum
model of net wealth is the real line ℝ = (−∞, ∞), which makes it possible to fit the subset of economic
units with null and negative wealth. Furthermore, it contains as particular cases both the Dagum types I and II
distributions [laV

In more detail, the Dagum general model of net wealth distribution is a mixture (or convex combination)
of an atomic and two continuous distributions. The atomic distribution concentrates its unit
mass of economic agents at zero, and therefore accounts for the economic units with null net wealth. The
continuous distribution accounting for the negative net wealth observations is given by a Weibull function.
It has a fast left-tail convergence to zero, and therefore it has finite moments of all orders. The
other continuous distribution, specified as the Dagum type I model, accounts for the positive values of net
wealth and presents a heavy right tail, thus having a small number of finite moments of positive order.
This different behavior at the two tails of the distribution stems from the fact that, unlike the right tail of
income and (gross or net) wealth distributions, which tends slowly to zero when income and wealth tend
to infinity, the distribution of the negative values (left tail) of net wealth tends very fast to zero when the
variable tends to minus infinity, since economic units face a short-term challenge of either moving out of
the negative range of net wealth or going bankrupt.

The purpose of the present work is to provide estimates for the 1984-2009 U.S. net wealth distributions
of this Dagum general model, partly motivated by the fact that we are aware of no applications other than
Dagum's own 17J, LL9|], the only notable exception being [21], who
fitted the model to Finnish net wealth data in 1984 and 1989. Furthermore, since other approaches can
be entertained and a comparative study of their relative merits performed, we also explore the possibility of
using alternative distributions to characterize positive net wealth values. That is, we formalize, analyze
and fit to our U.S. net wealth data finite mixture models based upon the Singh-Maddala and κ-generalized
distributions as specifications for the positive values. The Singh-Maddala [z^] is known to be very successful
in fitting empirical income distributions. The κ-generalized was proposed in our previous works [1I.I23I-2f|
to describe the distribution of personal income in some developed economies for different years. Positive
conclusions were drawn about its ability to provide an accurate description of the observed distributions,
ranging from the low to the middle region, and up to the right tail. The empirical success of the
κ-generalized was complemented by goodness-of-fit comparisons showing that fitting the distribution to
available income data offers superior performance over other existing models (including the Singh-Maddala
and Dagum type I) in a significant number of cases.

The content of the paper is organized as follows: Section 2 recalls some basic properties of the
κ-generalized statistical distribution; Section 3 presents the main analytical properties of the net wealth



qualitatively the same characteristics: many empirical wealth distributions are indeed positively skewed with "fat" and long 
right-hand tails, as are income distributions. 

2 In the 1950s, Refs. [l(J and [ll| proposed the Pareto type I model and the lognormal distribution, respectively. Afterward, 
other models were proposed: in 1969 the Pareto types I and II by [13] ; in 1975, the log-logistic by [lj| and the Pearson type 
V by [l^| . All of these models are restricted to describe only the positive range of wealth, since they are not defined for zero 
and/or negative values. 
distribution models; Section 4 deduces their corresponding moments; Section 5 derives the parametric
forms of the Lorenz curve and Gini ratio for the distribution of net wealth; Section 6 fits the specified
models to the U.S. data on household net wealth covering the years 1984-2009; and Section 7 presents the
conclusions.



2. The κ-generalized statistical distribution and its properties

After 2001, a physical mechanism emerging in the context of special relativity was proposed by one of
us [27-30], predicting a deformation of the exponential function. According to this mechanism, the classical
exponential distribution transforms into a new distribution, which at high energies presents a Pareto fat
tail. More precisely, this mechanism deforms the ordinary exponential function $\exp(x)$ into the generalized
exponential function $\exp_\kappa(x)$ given by

$$\exp_\kappa(x) = \left(\sqrt{1+\kappa^2x^2}+\kappa x\right)^{1/\kappa}. \tag{1}$$



The above deformation is generated by the fact that information propagates with a finite
speed, and the deformation parameter $\kappa$ is proportional to the reciprocal of this speed. The κ-generalized
exponential has the important properties

$$\exp_\kappa(x) \underset{x\to\pm\infty}{\sim} |2\kappa x|^{\pm1/|\kappa|}, \tag{2a}$$

$$\exp_\kappa(x) \underset{x\to0}{\sim} \exp(x). \tag{2b}$$



It is remarkable that for classical systems, where information propagates instantaneously, one has
$\kappa = 0$, so that the ordinary exponential emerges naturally after noting that $\exp_0(x) = \exp(x)$. Moreover,
in the low energy region $x \to 0$, according to Eq. (2b), the exponential distribution emerges again, because
the system behaves classically. On the contrary, in systems where information propagates with a finite
speed (these systems are intrinsically relativistic) one has $\kappa \neq 0$, so that the exponential tails become
fat according to Eq. (2a) and the Pareto law emerges.
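The two limiting regimes of Eqs. (2a) and (2b) can be checked numerically. The following is a minimal Python sketch, not part of the original analysis; the function names are ours, and the rewriting of Eq. (1) through the inverse hyperbolic sine, $\exp_\kappa(x) = \exp[\operatorname{asinh}(\kappa x)/\kappa]$, is a standard identity adopted here only for numerical stability.

```python
import math


def exp_kappa(x: float, kappa: float) -> float:
    """Deformed exponential of Eq. (1); the ordinary exp(x) is the kappa = 0 case."""
    if kappa == 0.0:
        return math.exp(x)
    # (sqrt(1 + k^2 x^2) + k x)^(1/k) rewritten as exp(asinh(k x) / k),
    # which avoids cancellation for large negative x.
    return math.exp(math.asinh(kappa * x) / kappa)


def exp_kappa_direct(x: float, kappa: float) -> float:
    """Literal transcription of Eq. (1), kept only for cross-checking."""
    return (math.sqrt(1.0 + (kappa * x) ** 2) + kappa * x) ** (1.0 / kappa)


if __name__ == "__main__":
    kappa = 0.5  # illustrative value, not an estimate from the paper
    for x in (1e-3, 1.0, 1e3, 1e6):
        power_law = (2 * kappa * x) ** (1 / kappa)  # right-hand side of Eq. (2a)
        print(x, exp_kappa(x, kappa), math.exp(min(x, 700.0)), power_law)
```

For small $x$ the first two printed columns agree (Eq. (2b)), while for large $x$ the first and last columns agree (Eq. (2a)).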

The generalized exponential represents a very useful and powerful tool to formulate a new statistical
theory capable of treating systems described by distribution functions exhibiting power-law tails and admitting
a stable entropy [31, 32]. Furthermore, non-linear evolution models already known in statistical physics
[33-35] can be easily adapted or generalized within the new theory.

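The function $\exp_\kappa(x)$ was also adopted successfully in the analysis of various non-physical systems
liS]- In Refs. QBH] we have used the function $\exp_\kappa(x)$ to model the personal income distribution
by defining the cumulative distribution function through

$$F(x) = 1-\exp_\kappa\left[-(x/\beta)^{\alpha}\right], \quad x>0, \quad \alpha,\beta>0, \quad \kappa\in[0,1). \tag{3}$$

The corresponding probability density function reads

$$f(x) = \frac{\alpha}{\beta}\left(\frac{x}{\beta}\right)^{\alpha-1}\frac{\exp_\kappa\left[-(x/\beta)^{\alpha}\right]}{\sqrt{1+\kappa^2(x/\beta)^{2\alpha}}}. \tag{4}$$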



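It follows immediately that for low incomes the distribution function behaves similarly to the Weibull model
$F(x) = 1-\exp[-(x/\beta)^{\alpha}]$, whereas for large $x$ it approaches a Pareto distribution with scale
$\beta(2\kappa)^{-1/\alpha}$ and shape $\alpha/\kappa$, i.e. $F(x) \sim 1-(2\kappa)^{-1/\kappa}(x/\beta)^{-\alpha/\kappa}$.
Similarly, the density function for $x\to0^{+}$ behaves as a Weibull distribution,
$f(x) \sim \frac{\alpha}{\beta}\left(\frac{x}{\beta}\right)^{\alpha-1}\exp[-(x/\beta)^{\alpha}]$, while for
$x\to+\infty$ it reduces to Pareto's law
$f(x) \sim \frac{\alpha}{\kappa}(2\kappa)^{-1/\kappa}\beta^{\alpha/\kappa}x^{-(\alpha/\kappa+1)}$.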
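The distribution function (3), the density (4) and the Pareto limit of the upper tail can be sketched in a few lines of Python. This is an illustrative implementation under our own naming; the parameter values used in the demonstration are hypothetical and are not estimates from the paper.

```python
import math


def exp_kappa(x, kappa):
    """Deformed exponential of Eq. (1), written via asinh for stability."""
    if kappa == 0.0:
        return math.exp(x)
    return math.exp(math.asinh(kappa * x) / kappa)


def kgen_sf(x, alpha, beta, kappa):
    """Survival function 1 - F(x) implied by Eq. (3)."""
    if x <= 0:
        return 1.0
    return exp_kappa(-((x / beta) ** alpha), kappa)


def kgen_cdf(x, alpha, beta, kappa):
    """Cumulative distribution function, Eq. (3)."""
    return 1.0 - kgen_sf(x, alpha, beta, kappa)


def kgen_pdf(x, alpha, beta, kappa):
    """Probability density function, Eq. (4)."""
    if x <= 0:
        return 0.0
    t = (x / beta) ** alpha
    weibull_part = (alpha / beta) * (x / beta) ** (alpha - 1)
    return weibull_part * exp_kappa(-t, kappa) / math.sqrt(1 + (kappa * t) ** 2)


def pareto_tail_sf(x, alpha, beta, kappa):
    """Large-x approximation of 1 - F(x): Pareto with shape alpha / kappa."""
    return (2 * kappa) ** (-1 / kappa) * (x / beta) ** (-alpha / kappa)


if __name__ == "__main__":
    alpha, beta, kappa = 2.0, 1.0, 0.5  # hypothetical values
    for x in (0.01, 1.0, 10.0, 100.0):
        print(x, kgen_sf(x, alpha, beta, kappa), pareto_tail_sf(x, alpha, beta, kappa))
```

The survival function is computed directly rather than as `1 - kgen_cdf(...)`, since the latter loses all precision once the tail probability drops below machine epsilon.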
3. Specification of finite mixture models for net wealth distribution

The general model of net wealth distribution as a mixture of an atomic and two continuous distributions
takes the form

$$f(w) = \sum_{i=1}^{3}\theta_i f_i(w), \quad -\infty<w<\infty, \quad \theta_i\ge0, \quad \sum_{i=1}^{3}\theta_i=1, \tag{5}$$

where $w$ denotes the wealth variable and $\{\theta_i\}_{i=1,\ldots,3}$ are the mixture proportions. The two-parameter
Weibull density

$$f_1(w) = \frac{s}{\lambda}\left(\frac{|w|}{\lambda}\right)^{s-1}\exp\left[-\left(\frac{|w|}{\lambda}\right)^{s}\right], \quad w<0, \quad (s,\lambda)>0, \tag{6}$$

describes the distribution of economic units with negative net wealth, while the null net wealth observations
are accounted for by a distribution that concentrates its unit mass at $w = 0$, i.e.

$$f_2(0) = 1. \tag{7}$$

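The other continuous distribution, $f_3(w)$, accounts for the positive values of net wealth, and is alternatively
specified by the following three-parameter densities:

1. the Singh-Maddala

$$f_3^{SM}(w) = \frac{aq\,w^{a-1}}{b^{a}\left[1+(w/b)^{a}\right]^{1+q}}, \quad w>0, \quad (a,b,q)>0; \tag{8}$$

2. the Dagum type I

$$f_3^{D}(w) = \frac{ap\,w^{ap-1}}{b^{ap}\left[1+(w/b)^{a}\right]^{p+1}}, \quad w>0, \quad (a,b,p)>0; \tag{9}$$

3. the κ-generalized given by Eq. (4).

The corresponding cumulative distribution function reads

$$F(w) = \theta_1F_1(w)+\theta_2F_2(w)+\theta_3F_3(w), \quad \theta_1+\theta_2=p, \quad \theta_3=1-p, \tag{10}$$

where

$$F_1(w) = \begin{cases}\exp\left[-\left(\dfrac{|w|}{\lambda}\right)^{s}\right], & w<0;\\ 1, & w\ge0;\end{cases} \tag{11a}$$

$$F_2(w) = \begin{cases}0, & w<0;\\ 1, & w\ge0;\end{cases} \tag{11b}$$

$$F_3(w) = \begin{cases}0, & w\le0;\\ F_3(w), & w>0.\end{cases} \tag{11c}$$

Hence

$$F(w) = \begin{cases}\theta_1\exp\left[-\left(\dfrac{|w|}{\lambda}\right)^{s}\right], & w<0;\\ p, & w=0;\\ p+(1-p)F_3(w), & w>0,\end{cases} \tag{12}$$

with $F_3(w)$ having the following alternative mathematical specifications

$$F_3^{SM}(w) = 1-\left[1+\left(\frac{w}{b}\right)^{a}\right]^{-q}, \tag{13a}$$

$$F_3^{D}(w) = \left[1+\left(\frac{w}{b}\right)^{-a}\right]^{-p}, \tag{13b}$$

$$F_3^{\kappa\text{-gen}}(w) = 1-\exp_\kappa\left[-\left(\frac{w}{\beta}\right)^{\alpha}\right]. \tag{13c}$$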
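As an illustration of how the three branches of Eq. (12) fit together, the sketch below evaluates the mixture distribution function with the κ-generalized specification (13c) for the positive part. The function and argument names are ours. The mixture proportions in the demonstration are the 2003 sample shares of negative and zero net wealth, while the remaining parameter values are hypothetical placeholders rather than fitted estimates.

```python
import math


def exp_kappa(x, kappa):
    """Deformed exponential of Eq. (1), written via asinh for stability."""
    if kappa == 0.0:
        return math.exp(x)
    return math.exp(math.asinh(kappa * x) / kappa)


def mixture_cdf(w, theta1, theta2, s, lam, alpha, beta, kappa):
    """Net wealth distribution function of Eq. (12) with F3 given by Eq. (13c)."""
    p = theta1 + theta2
    if w < 0:   # Weibull branch for negative net wealth
        return theta1 * math.exp(-((abs(w) / lam) ** s))
    if w == 0:  # the atom at zero adds theta2 to the left limit theta1
        return p
    f3 = 1.0 - exp_kappa(-((w / beta) ** alpha), kappa)
    return p + (1.0 - p) * f3


if __name__ == "__main__":
    # theta1, theta2: 2003 shares of negative and zero net wealth;
    # s, lam, alpha, beta, kappa: hypothetical values for illustration only.
    params = dict(theta1=0.09439, theta2=0.03872, s=0.8, lam=5000.0,
                  alpha=0.9, beta=60000.0, kappa=0.6)
    for w in (-50000.0, -1000.0, 0.0, 1000.0, 50000.0, 1e6):
        print(w, round(mixture_cdf(w, **params), 4))
```

The jump of size $\theta_2$ at $w = 0$ is what distinguishes this model from the purely continuous specifications used for income.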
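4. Moments of finite mixture models for net wealth distribution

It follows from model (5) that the rth-order moment about the origin is

$$\mu_r = E(W^r) = \int_{-\infty}^{\infty}w^r f(w)\,dw = \theta_1E_1(W^r)+\theta_2E_2(W^r)+\theta_3E_3(W^r), \tag{14}$$

where³

$$E_1(W^r) = (-1)^r\lambda^r\,\Gamma\left(1+\frac{r}{s}\right) \tag{15}$$

and $E_2(W^r) = 0$.

As for $E_3(W^r)$ in the last member of Eq. (14), according to the alternative distributions considered to
characterize positive net wealth values one gets⁴

$$E_3^{SM}(W^r) = \frac{b^r\,\Gamma\left(1+\frac{r}{a}\right)\Gamma\left(q-\frac{r}{a}\right)}{\Gamma(q)}; \tag{16a}$$

$$E_3^{D}(W^r) = \frac{b^r\,\Gamma\left(p+\frac{r}{a}\right)\Gamma\left(1-\frac{r}{a}\right)}{\Gamma(p)}; \tag{16b}$$

$$E_3^{\kappa\text{-gen}}(W^r) = \beta^r(2\kappa)^{-r/\alpha}\,\frac{\Gamma\left(1+\frac{r}{\alpha}\right)}{1+\frac{r}{\alpha}\kappa}\,\frac{\Gamma\left(\frac{1}{2\kappa}-\frac{r}{2\alpha}\right)}{\Gamma\left(\frac{1}{2\kappa}+\frac{r}{2\alpha}\right)}. \tag{16c}$$

The mean net wealth equals

$$\mu_1 = E(W) = -\theta_1\lambda\,\Gamma\left(1+\frac{1}{s}\right)+\theta_3E_3(W), \tag{17}$$

where $E_3(W)$ is alternatively given by Eqs. (16) with $r = 1$.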
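The moment formulas of this section translate directly into code. The sketch below covers the κ-generalized specification only and uses names of our own choosing. The log-gamma function is used for the gamma ratio so that small values of κ do not overflow, and the guard `r < alpha / kappa` simply enforces positivity of the gamma-function argument in Eq. (16c). The special case $\alpha = 1$, $\beta = 1$, $\kappa = 1/2$ has mean $4/3$, which can be verified by integrating the survival function by hand and serves as a check.

```python
import math


def kgen_moment(r, alpha, beta, kappa):
    """r-th raw moment of the kappa-generalized distribution, Eq. (16c)."""
    if not 0 < kappa < 1:
        raise ValueError("this sketch assumes 0 < kappa < 1")
    if r >= alpha / kappa:
        raise ValueError("moment is infinite unless r < alpha / kappa")
    a, c = 1 / (2 * kappa), r / (2 * alpha)
    gamma_ratio = math.exp(math.lgamma(a - c) - math.lgamma(a + c))
    return (beta ** r * (2 * kappa) ** (-r / alpha) * math.gamma(1 + r / alpha)
            / (1 + r * kappa / alpha) * gamma_ratio)


def weibull_negative_moment(r, s, lam):
    """r-th raw moment of the negative-wealth component, Eq. (15); r is an integer."""
    return (-1) ** r * lam ** r * math.gamma(1 + r / s)


def mixture_moment(r, theta1, theta2, s, lam, alpha, beta, kappa):
    """Eq. (14); the atom at zero contributes nothing."""
    theta3 = 1.0 - theta1 - theta2
    return (theta1 * weibull_negative_moment(r, s, lam)
            + theta3 * kgen_moment(r, alpha, beta, kappa))


if __name__ == "__main__":
    print(kgen_moment(1, 1.0, 1.0, 0.5))  # analytic value is 4/3
    # Hypothetical parameter values, for illustration only:
    print(mixture_moment(1, 0.09439, 0.03872, 0.8, 5000.0, 0.9, 60000.0, 0.6))
```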



3 In what follows, Γ(·) stands for the Euler gamma function.

4 See [1] for relevant expressions. Formulas for the moments of the κ-generalized distribution are given in [H. li^ - liil ].

5. The Lorenz curve and the Gini ratio of the net wealth distribution models

By definition, the Lorenz curve [39] describes a relation between the cumulative distribution function,
$F(w)$, and the first cumulative moment distribution function, given by

$$L(u) = \frac{1}{\mu_1}\int_{-\infty}^{w}w'f(w')\,dw' = \frac{1}{\mu_1}\int_{0}^{u}w(v)\,dv, \quad u\in[0,1], \tag{18}$$

where $u = F(w)$ and $w(u) = F^{-1}(u)$ denotes the quantile function. Given the mathematical structure of
the general net wealth distribution model (5) and (10), we have


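$$L^{SM}(u) = \begin{cases}
-\dfrac{\lambda\theta_1}{\mu_1}\,\Gamma\left(1+\dfrac{1}{s},\,\log\dfrac{\theta_1}{u}\right), & 0\le u<\theta_1;\\[2ex]
-\dfrac{\lambda\theta_1}{\mu_1}\,\Gamma\left(1+\dfrac{1}{s}\right), & \theta_1\le u\le p;\\[2ex]
\dfrac{1}{\mu_1}\left\{(1-p)\,bq\left[B\left(q-\dfrac{1}{a},1+\dfrac{1}{a}\right)-B\left(\left(\dfrac{1-u}{1-p}\right)^{1/q};\,q-\dfrac{1}{a},1+\dfrac{1}{a}\right)\right]-\lambda\theta_1\Gamma\left(1+\dfrac{1}{s}\right)\right\}, & u>p;
\end{cases} \tag{19a}$$

$$L^{D}(u) = \begin{cases}
-\dfrac{\lambda\theta_1}{\mu_1}\,\Gamma\left(1+\dfrac{1}{s},\,\log\dfrac{\theta_1}{u}\right), & 0\le u<\theta_1;\\[2ex]
-\dfrac{\lambda\theta_1}{\mu_1}\,\Gamma\left(1+\dfrac{1}{s}\right), & \theta_1\le u\le p;\\[2ex]
\dfrac{1}{\mu_1}\left\{(1-p)\,bp\,B\left(\left(\dfrac{u-p}{1-p}\right)^{1/p};\,p+\dfrac{1}{a},1-\dfrac{1}{a}\right)-\lambda\theta_1\Gamma\left(1+\dfrac{1}{s}\right)\right\}, & u>p;
\end{cases} \tag{19b}$$

$$L^{\kappa\text{-gen}}(u) = \begin{cases}
-\dfrac{\lambda\theta_1}{\mu_1}\,\Gamma\left(1+\dfrac{1}{s},\,\log\dfrac{\theta_1}{u}\right), & 0\le u<\theta_1;\\[2ex]
-\dfrac{\lambda\theta_1}{\mu_1}\,\Gamma\left(1+\dfrac{1}{s}\right), & \theta_1\le u\le p;\\[2ex]
\dfrac{1}{\mu_1}\left\{\dfrac{(1-p)\,\beta}{(2\kappa)^{1+\frac{1}{\alpha}}}\left[B\left(\dfrac{1}{2\kappa}-\dfrac{1}{2\alpha},1+\dfrac{1}{\alpha}\right)-B\left(\left(\dfrac{1-u}{1-p}\right)^{2\kappa};\,\dfrac{1}{2\kappa}-\dfrac{1}{2\alpha},1+\dfrac{1}{\alpha}\right)\right]-\lambda\theta_1\Gamma\left(1+\dfrac{1}{s}\right)\right\}, & u>p,
\end{cases} \tag{19c}$$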

where $B(\cdot,\cdot)$ and $B(\cdot;\cdot,\cdot)$ denote, respectively, the complete and incomplete Euler beta functions, and
$\Gamma(\cdot,\cdot)$ is the upper incomplete gamma function. Eqs. (19) determine the path of the net wealth Lorenz curve
$L(u)$ over the closed interval $[0,1]$ for the different specifications of the net wealth finite mixture model.
It follows that for $u = 1$, $L(1) = 1$.
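The shape of the net wealth Lorenz curve (decreasing and negative up to $\theta_1$, flat between $\theta_1$ and $p$, increasing afterwards) can be reproduced numerically from the second integral in Eq. (18). The sketch below does not evaluate the closed forms above: it inverts the distribution function branch by branch to obtain the quantile function and integrates it with a simple midpoint rule, a choice made here only to stay within the Python standard library. All names and parameter values are our own illustrative assumptions.

```python
import math


def kgen_mean(alpha, beta, kappa):
    """Mean of the kappa-generalized distribution (requires alpha > kappa)."""
    a, c = 1 / (2 * kappa), 1 / (2 * alpha)
    ratio = math.exp(math.lgamma(a - c) - math.lgamma(a + c))
    return (beta * (2 * kappa) ** (-1 / alpha) * math.gamma(1 + 1 / alpha)
            / (1 + kappa / alpha) * ratio)


def mixture_quantile(u, theta1, theta2, s, lam, alpha, beta, kappa):
    """Quantile function w(u) of the mixture, for 0 < u < 1."""
    p = theta1 + theta2
    if u < theta1:                     # negative net wealth (Weibull branch)
        return -lam * math.log(theta1 / u) ** (1 / s)
    if u <= p:                         # atom at zero
        return 0.0
    tail = (1 - u) / (1 - p)           # 1 - F3(w) in the positive component
    ln_k = (tail ** -kappa - tail ** kappa) / (2 * kappa)  # ln_kappa(1 / tail)
    return beta * ln_k ** (1 / alpha)


def lorenz(u, params, n=20000):
    """L(u) from the quantile integral in Eq. (18), by the midpoint rule."""
    if u == 0:
        return 0.0
    theta1, theta2, s, lam, alpha, beta, kappa = params
    mu1 = (-theta1 * lam * math.gamma(1 + 1 / s)
           + (1 - theta1 - theta2) * kgen_mean(alpha, beta, kappa))
    h = u / n
    total = sum(mixture_quantile((i + 0.5) * h, *params) for i in range(n))
    return h * total / mu1


if __name__ == "__main__":
    # (theta1, theta2, s, lam, alpha, beta, kappa): hypothetical values
    params = (0.09439, 0.03872, 0.8, 5000.0, 0.9, 60000.0, 0.6)
    for u in (0.05, 0.09439, 0.13311, 0.5, 0.9):
        print(u, lorenz(u, params))
```

The midpoint rule converges slowly near $u = 1$ when the upper tail is heavy, so this sketch is meant for inspecting the lower and middle parts of the curve rather than for precise values close to the top.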

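Since the net wealth Lorenz curve presents negative values for all $u < p$, it can be proved that the Gini
inequality ratio takes the form

$$G = \frac{2\int_{0}^{1}[u-L(u)]\,du}{1+p\,|L(\theta_1)|} = \frac{1-2\int_{0}^{1}L(u)\,du}{1-p\,L(\theta_1)}, \tag{20}$$

where

$$\int_{0}^{1}L(u)\,du = \int_{0}^{\theta_1}L(u)\,du+\int_{\theta_1}^{p}L(u)\,du+\int_{p}^{1}L(u)\,du. \tag{21}$$

Using Eqs. (19), the Gini ratio becomes

$$G^{SM} = \frac{\mu_1-2\left\{(1-p)^2\,bq\,B\left(2q-\frac{1}{a},1+\frac{1}{a}\right)-\lambda\theta_1\left(1-\frac{\theta_1}{2^{1+1/s}}\right)\Gamma\left(1+\frac{1}{s}\right)\right\}}{\mu_1+p\lambda\theta_1\Gamma\left(1+\frac{1}{s}\right)}; \tag{22a}$$

$$G^{D} = \frac{\mu_1-2\left\{(1-p)^2\,bp\left[B\left(p+\frac{1}{a},1-\frac{1}{a}\right)-B\left(2p+\frac{1}{a},1-\frac{1}{a}\right)\right]-\lambda\theta_1\left(1-\frac{\theta_1}{2^{1+1/s}}\right)\Gamma\left(1+\frac{1}{s}\right)\right\}}{\mu_1+p\lambda\theta_1\Gamma\left(1+\frac{1}{s}\right)}; \tag{22b}$$

$$G^{\kappa\text{-gen}} = \frac{\mu_1-2\left\{\frac{(1-p)^2\beta}{(2\kappa)^{1+1/\alpha}}\,B\left(\frac{1}{\kappa}-\frac{1}{2\alpha},1+\frac{1}{\alpha}\right)-\lambda\theta_1\left(1-\frac{\theta_1}{2^{1+1/s}}\right)\Gamma\left(1+\frac{1}{s}\right)\right\}}{\mu_1+p\lambda\theta_1\Gamma\left(1+\frac{1}{s}\right)}. \tag{22c}$$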
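The closed-form Gini ratio for the κ-generalized mixture can be evaluated with the standard library alone, since the complete beta function is expressible through log-gamma. The sketch below follows the structure of Eq. (20), with $L(\theta_1) = -\lambda\theta_1\Gamma(1+1/s)/\mu_1$ and the integral (21) computed analytically. Names are ours, the demonstration parameters are hypothetical, and the sketch assumes $\alpha > \kappa$ so that the mean is finite. With $\theta_1 = \theta_2 = 0$, $\alpha = 1$ and $\kappa = 1/2$ it returns $0.6$, the Gini coefficient of the pure κ-generalized distribution for those values.

```python
import math


def beta_fn(a, b):
    """Complete Euler beta function B(a, b) via log-gamma."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))


def gini_kgen_mixture(theta1, theta2, s, lam, alpha, beta, kappa):
    """Gini ratio of the net wealth mixture with kappa-generalized positive part."""
    p = theta1 + theta2
    g = math.gamma(1 + 1 / s)
    scale = beta / (2 * kappa) ** (1 + 1 / alpha)
    mean_pos = scale * beta_fn(1 / (2 * kappa) - 1 / (2 * alpha), 1 + 1 / alpha)
    mu1 = -theta1 * lam * g + (1 - p) * mean_pos          # Eq. (17)
    pos_term = (1 - p) ** 2 * scale * beta_fn(1 / kappa - 1 / (2 * alpha), 1 + 1 / alpha)
    neg_term = lam * theta1 * (1 - theta1 / 2 ** (1 + 1 / s)) * g
    return (mu1 - 2 * (pos_term - neg_term)) / (mu1 + p * lam * theta1 * g)


if __name__ == "__main__":
    print(gini_kgen_mixture(0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.5))
    # Hypothetical parameter values, for illustration only:
    print(gini_kgen_mixture(0.09439, 0.03872, 0.8, 5000.0, 0.9, 60000.0, 0.6))
```

Because every term is linear in the scale parameters, multiplying both $\lambda$ and $\beta$ by the same constant leaves the result unchanged, as expected of a relative inequality measure.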



6. Application 

6.1. The U.S. data on household net wealth

The empirical analysis is based on data drawn from the Panel Study of Income Dynamics (PSID), a 
nationally representative household survey collected by the Survey Research Center at the University of 
Michigan since 1968. The PSID provides detailed information about economic, demographic, sociological 
and psychological aspects of many U.S. households. Since the focus is on the distribution of wealth, we use 
all (nine) waves currently available of the special PSID supplement asking information on household wealth 
holdings. This supplement was added in 1984 and was conducted on a periodic basis prior to 1999 (in 
1984, 1989 and 1994). After 1997 the basic PSID survey switched to biennial data collection, and starting 
with 1999 the wealth questions have been included in each wave (1999, 2001, 2003, 2005, 2007 and 2009). 

As shown in Table 1, the number of households participating in the various waves varies between 6
and 9 thousand, providing samples for analysis that are reasonably representative of the "true" wealth
distribution in the U.S.⁵ In particular, we are concerned with the distribution of net wealth, which is
constructed as the sum of the values of several asset types net of debt held by each household.⁶ Since net wealth is
expressed in nominal local currency units, all figures have been deflated to allow for meaningful comparisons
over the period covered by the data. To do so, we have employed the Consumer Price Index deflator (yearly
series based on year 2005) provided by the OECD.⁷ Furthermore, after a simple adjustment for differences
in relative needs of households according to their size,⁸ net wealth values have been weighted by using
appropriate sampling weights provided by the PSID staff in order to produce representative estimates for
all households in the target population.
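The data preparation steps just described (deflation to 2005 prices, the square-root equivalence scale, and sampling weights) amount to a few arithmetic operations per household. The sketch below shows one way to express them; the function names, the CPI value and the household records are hypothetical and do not come from the PSID or OECD files.

```python
import math


def real_equivalized_wealth(nominal_wealth, cpi, household_size):
    """Deflate to base-year prices (CPI = 100 in the base year) and divide by
    the square root of household size (the equivalence scale adopted here)."""
    real = nominal_wealth / (cpi / 100.0)
    return real / math.sqrt(household_size)


def weighted_mean(values, weights):
    """Sample mean using household sampling weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


if __name__ == "__main__":
    # Hypothetical households: (nominal net wealth, household size, sampling weight)
    households = [(100000.0, 4, 1.5), (-20000.0, 1, 0.8), (0.0, 2, 1.0)]
    cpi = 80.0  # hypothetical index value for the survey year (2005 = 100)
    adjusted = [real_equivalized_wealth(w, cpi, size) for w, size, _ in households]
    print(adjusted)
    print(weighted_mean(adjusted, [wt for _, _, wt in households]))
```

Negative and zero values pass through both adjustments unchanged in sign, which is what allows them to be modeled explicitly in the mixture.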

Table 1 also provides a number of summary statistics. Consider first the prevalence of zero and negative
values. On the basis of the PSID data, the proportion of households with negative net wealth rose steadily 
between 1984 and 2009 (from less than 7% to over 14%) whilst the proportion of households with zero net 
wealth increased somewhat between 1984 and 1994 (from slightly more than 4% to about 5%) followed 



5 For more on this issue, see for instance [40] and [41]. In particular, measured against the standards set by two prominent
American household wealth surveys, the Survey of Consumer Finances (SCF) and the Survey of Income and Program
Participation (SIPP), the PSID does not differ substantially from them when it comes to measuring total wealth and its distribution
among the great bulk of the U.S. population. Moreover, its measurement error characteristics appear to be consistently better
than those of the SCF and the SIPP: the PSID has a lower item nonresponse rate than these alternative data sets
and thus less need to construct imputed values.

6 The PSID asks about eight broad wealth categories: (1) value of farm or business assets; (2) value of checking and savings 
accounts, money market funds, certificates of deposit, savings bonds, Treasury bills, other Individual Retirement Accounts 
(IRAs); (3) value of real estate other than main home; (4) value of shares of stock in publicly held corporations, mutual funds 
or investment trusts, including stocks in IRAs; (5) value of vehicles or other assets "on wheels"; (6) value of other investments 
in trusts or estates, bond funds, life insurance policies, special collections; (7) value of private annuities or IRAs; (8) value of 
home equity (calculated as home value minus remaining mortgage). More complete definitions of the asset and debt categories 
are available at the PSID web site: http://psidonline.isr.umich.edu/

7 Available at: http://stats.oecd.org/

8 When the distribution of wealth is defined over households and not over individuals, a problem arises with regard to the 
possibility of comparing wealth holdings of different units. The reason is that households vary in size and thus wealth levels 
are not a good indicator of their well-being, as households with a different number of members may have different needs in 
the use of wealth even when this is the same order of magnitude across them. In this case, a correction should be made to 
meaningfully compare different situations. This correction is called an equivalence scale. There is a wide range of equivalence 
scales in use in different countries and by different organizations. All take account of household size: in many scales this is the 
only factor, whilst in those taking into account other considerations it is the factor with greatest weight. Choices of equivalence 
scale in recent wealth studies are reviewed in [43]. Here we adopt a simple equivalence scale that is most commonly used in
international studies [44], where net household wealth is divided by the square root of the number of household members.
Table 1: Summary statistics for U.S. household net wealth, 1984-2009

| Statistic / Wave | 1984 | 1989 | 1994 | 1999 | 2001 | 2003 | 2005 | 2007 | 2009 |
|---|---|---|---|---|---|---|---|---|---|
| Obs | 6,918 | 7,113 | 7,415 | 6,851 | 7,195 | 7,565 | 8,002 | 8,289 | 8,690 |
| Mean | 121,613 | 135,095 | 135,885 | 185,055 | 189,139 | 201,991 | 223,506 | 256,281 | 248,753 |
| Median | 36,940 | 38,988 | 42,390 | 45,958 | 50,735 | 50,295 | 53,276 | 56,744 | 39,143 |
| Skewness | 18.340 | 15.592 | 13.821 | 18.234 | 19.6m | 15.0 I'd | 17 COO | zo. 006 | 31. (bb |
| Kurtosis | 410.821 | 364.775 | 302.102 | 454.349 | 598.511 | 317.070 | 513.130 | 909.552 | 1,193.073 |
| Gini | 0.758 | 0.759 | 0.751 | 0.789 | 0.774 | 0.788 | 0.782 | 0.803 | 0.850 |
| % with W < 0 | 6.807 | 8.096 | 8.636 | 9.363 | 9.409 | 9.439 | 10.240 | 11.137 | 14.385 |
| % with W = 0 | 4.285 | 4.554 | 4.751 | 3.557 | 3.428 | 3.872 | 3.715 | 3.725 | 4.484 |
| % with W > 0 | 88.908 | 87.350 | 86.614 | 87.080 | 87.163 | 86.689 | 86.045 | 85.138 | 81.130 |

Source: Authors' own calculations using the PSID supplemental wealth files.



by a decline towards 3% until 2001. By 2003, the percentage of these households started increasing again 
to almost 4% and stayed nearly the same in the following two waves (2005 and 2007) before reaching, in 
2009, about the same level of 1984. Notwithstanding these differences in the proportions of negatives and 
zeros with regard to time trends and levels, when their joint prevalence is taken into account we find it to 
be relatively high on average (around 14% of the sample size). This situation is quite different from that 
generally faced in the case of income data, where it is often assumed that income can only take on positive 
values — in practice, there may be non-positive incomes but usually the number of these is so small that one 
can just ignore them. By contrast, in the case of net wealth data the assumption of dealing with a positive
quantity cannot be justified, since it is a matter of fact that many people enter a period of indebtedness
at some point in their life. Therefore, net wealth may legitimately take on negative and zero values, and
the proportion of such observations could be non-negligible (as in our case) in representative samples of
the target population.⁹

Results on time trend in real mean household net wealth show that it rose continuously by some 111% 
from 1984 to 2007 and then fell by almost 3% between 2007 and 2009, for an overall annual growth rate of 
about 3% over the entire period. The time trend (although not the magnitude of level changes) in median 
net wealth appears to mirror that of the mean. Indeed, the PSID data show median net wealth rising in 
real terms by some 54% from 1984 to 2007 — save for a temporary slight decrease by less than 1% between 
2001 and 2003 — and then quickly reaching the same level as in 1989 by a sharp fall-off of around 31% 
between 2007 and 2009, for an overall annual growth rate of about 0.2% over the twenty-five years. 
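The growth figures quoted in this paragraph follow from the mean and median rows of Table 1. The short sketch below reproduces the arithmetic; the helper names are ours, and the annualized rate is computed as a compound rate over the 25 years from 1984 to 2009, which is our reading of "overall annual growth rate".

```python
# Mean and median real net wealth from Table 1 (selected waves).
mean = {1984: 121_613, 2007: 256_281, 2009: 248_753}
median = {1984: 36_940, 2007: 56_744, 2009: 39_143}


def pct_change(new, old):
    """Percentage change between two levels."""
    return 100.0 * (new / old - 1.0)


def annual_growth(new, old, years):
    """Compound annual growth rate in percent (an assumed definition)."""
    return 100.0 * ((new / old) ** (1.0 / years) - 1.0)


if __name__ == "__main__":
    print("mean 1984-2007:", pct_change(mean[2007], mean[1984]))
    print("mean 2007-2009:", pct_change(mean[2009], mean[2007]))
    print("mean per year:", annual_growth(mean[2009], mean[1984], 25))
    print("median 1984-2007:", pct_change(median[2007], median[1984]))
    print("median 2007-2009:", pct_change(median[2009], median[2007]))
    print("median per year:", annual_growth(median[2009], median[1984], 25))
```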

The change over time in the relationship between the mean and median is shown in Figure 1. To
provide an indication of how the distribution of wealth across households has changed, the evolution of
the relative positions of households at the two ends of the distribution (i.e., the bottom and top quintile
groups or bottom and top 20%) is also displayed.¹⁰ As noted above, both mean and median net wealth
increased from 1984 to 2007, with the mean typically increasing to a greater extent than the median. This 
suggests that in recent decades wealth became more concentrated among households at the upper end of 
the distribution, and indeed in those years where the divergence between the mean and the median became 
wider — i.e., between 1994 and 2007 — the largest changes in net wealth holdings of households in the top of 
the real distribution were also observed. By contrast, both measures fell during the 2007-2009 recession. 
The relatively greater decline in the median than in the mean suggests that the recession more adversely 
affected the households in the bottom of the wealth distribution than those further up, as shown by the 
worsening relative position for the bottom 20% of them. 



9 For further discussion on this issue, we refer the reader to [21] and [45].

10 Changes in the aggregates of Figure 1 over the twenty-five year span are measured by index numbers. An index number is
calculated by dividing the value in the year of interest by the value in the base year (1984 in our case) and then multiplying
the result by 100. The base year index is always 100 and the index for each subsequent year will be above or below 100,
depending on whether there has been an increase or decrease in the data compared with the base year.
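As a small illustration of the index-number construction described in footnote 10, the sketch below rebases the mean and median series of Table 1 to 1984 = 100. The function name is ours; only the two series taken from Table 1 are used, since the quintile series plotted in Figure 1 are not tabulated in the text.

```python
waves = [1984, 1989, 1994, 1999, 2001, 2003, 2005, 2007, 2009]
mean = [121613, 135095, 135885, 185055, 189139, 201991, 223506, 256281, 248753]
median = [36940, 38988, 42390, 45958, 50735, 50295, 53276, 56744, 39143]


def index_numbers(series, base_position=0):
    """Rescale a series so that the base-year value equals 100."""
    base = series[base_position]
    return [100.0 * value / base for value in series]


if __name__ == "__main__":
    for year, m, md in zip(waves, index_numbers(mean), index_numbers(median)):
        print(year, round(m, 1), round(md, 1))
```

Printing the two indices side by side makes the widening gap between mean and median up to 2007 directly visible.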
Figure 1: Statistics of U.S. household net wealth distribution, 1984-2009



One might suspect that the differences in the pace of real growth between the mean and median net 
wealth are partly caused by the presence of long and heavy tails in the distribution of U.S. household 
net wealth, particularly at the top of the data range. Indeed, the positive skewness values listed in the 
fourth row of Table 1 suggest that the distribution of net wealth in any one year has a long tail toward the
upper end, thus indicating a non-trivial prevalence of values that are "extremes" in relation to the rest of
the data. Furthermore, in each of the wave years the level of kurtosis is huge as compared to the normal
distribution (fifth row of Table 1), meaning that the upper tail of the net wealth distribution is inevitably
"fat", i.e. it declines to zero more slowly than exponentially. As the median would not be affected by the
extreme values, this results in average net wealth holdings that are consistently larger than median ones 
in all cases. 

Additional information about the fatness of the upper tail of the U.S. net wealth distribution can
be obtained from visual examination of the sample mean excess plot shown in Figure 2.¹¹ For a sequence
of threshold values $\{w_i\}_{i=1,\ldots,N}$, the mean excess plot reports the mean of exceedances over $w_i$
against $w_i$ itself. Putting it differently, this is a plot of the set of pairs $\{(w_i, e_n(w_i))\}_{i=1,\ldots,N-1}$, where

$$e_n(w_i) = \frac{\sum_{j=i+1}^{N}\pi_j\,(w_j-w_i)}{\sum_{j=i+1}^{N}\pi_j}$$

is the sample mean excess function (weighted by household weights $\{\pi_i\}_{i=1,\ldots,N}$) and $\{w_i\}_{i=1,\ldots,N}$
are sample observations ranked from least to greatest. If the points in
the plot show an upward trend, then this is a sign of heavy-tailed behavior. Exponentially distributed data
would give an approximately horizontal line and data from a short-tailed distribution would show a downward
trend. In particular, if the empirical mean excess plot seems to follow a reasonably straight line with
positive slope above a certain net wealth value, then this is an indication of Pareto (power-law) behavior
in the tail. This is precisely the kind of behavior we observe in the 2003 PSID data. In fact, apart from some
noisiness in the most extreme observations, there is evidence of a consistent upward trend in the data and
a straightening out of the plot above some point, hence providing a statistical justification for the
emergence of power laws as limiting behavior for the very wealthy.

11 Properties of the mean excess plot are reviewed, for instance, in [46]. We do not report plots for each year but they are
available upon request. Since we are interested here in the upper tail behavior of the distribution, the plot has been drawn
only for the positive values of net wealth.

Figure 2: Mean excess plot for the positive values of U.S. household net wealth in 2003
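The weighted sample mean excess function can be computed in a single backward pass over the sorted observations. The sketch below is an illustration under our own naming, with toy data standing in for the PSID observations; it returns the pairs $(w_i, e_n(w_i))$ that would be plotted. For reference, a Pareto type I tail with shape $a > 1$ has the linear mean excess function $e(u) = u/(a-1)$, which is why a straight upward-sloping plot is read as evidence of power-law behavior.

```python
def mean_excess_points(values, weights):
    """Return [(w_i, e_n(w_i))] for i = 1..N-1, with weighted exceedance means."""
    pairs = sorted(zip(values, weights))      # rank observations from least to greatest
    n = len(pairs)
    points = [None] * (n - 1)
    weight_above = 0.0                        # sum of pi_j for j > i
    weighted_sum_above = 0.0                  # sum of pi_j * w_j for j > i
    for i in range(n - 1, 0, -1):
        w_j, pi_j = pairs[i]
        weight_above += pi_j
        weighted_sum_above += pi_j * w_j
        threshold = pairs[i - 1][0]
        points[i - 1] = (threshold, weighted_sum_above / weight_above - threshold)
    return points


if __name__ == "__main__":
    # Toy positive net wealth values with hypothetical household weights
    wealth = [4.0, 1.0, 2.0, 8.0, 16.0]
    weights = [1.0, 2.0, 1.0, 0.5, 0.5]
    for threshold, excess in mean_excess_points(wealth, weights):
        print(threshold, excess)
```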

Does this finding matter when it comes to inequality judgments? Figure 3 displays the pattern of the Gini
coefficient for the distribution of U.S. household net wealth over the period 1984-2009 (the corresponding
values are reported in the fourth-last row of Table 1). At least three different sub-periods are shown: from
the second half of the 1980s to the first half of the 1990s, from the late 1990s to the first half of the 2000s 
and the last time interval (2007-2009). According to the PSID, net wealth inequality remained virtually 
unchanged during the first sub-period. Indeed, the Gini coefficient rose slightly between 1984 and 1989 
(from 0.758 to 0.759) and then fell in 1994 to a level below that of 1984 (0.751). By contrast, inequality 
increased sharply between 1994 and 1999, with the Gini coefficient of net wealth climbing to 0.789. The 
following years still show almost the same degree of inequality: the Gini coefficient was estimated at 0.788 
in 2003 and 0.782 in 2005, except for a temporary decrease to a value of 0.774 in 2001. Finally, between 
2007 and 2009 net wealth inequality was up steeply, with the Gini coefficient advancing from 0.803 to
0.850.

Figure 3: Gini coefficient and net wealth share of top 20% across years

Figure 3 also displays the evolution between 1984 and 2009 of the share of total net wealth held by
the richest 20% of households, which amounted on average to around 80% of the whole over the period.
A noteworthy result is that the observed time pattern of inequality seems to have been driven by the
conspicuous wealth holdings at the very top end of the distribution. Indeed, as can be seen from the figure,
the time profile of the net wealth share of the wealthiest 20% is analogous to that of the Gini coefficient: after
rising to a peak in 1999, it went down and then started to increase again until 2009.¹²

To sum up the above, wealth in the U.S. has become more concentrated in recent decades. Net wealth 
inequality increased by the mid-1990s, and the increase was not interrupted during the 2007-2009 recession. 
The share of total net wealth held by the top wealth owners has also grown during the same period, whereas 
at the other end of the wealth distribution there was a sharp increase in the number of households with 
zero or negative net wealth. Needless to say, this has resulted in a widening gap between the rich and the
poor, which calls for more attention to be paid to the implementation of appropriate and practical policies
aimed at reducing inequalities, limiting their negative effects on the socio-economic system and reversing
the mechanisms producing them [47].



12 The correlation coefficient between the two series of Gini coefficient and the net wealth share received by the top 20% is 
0.998, which is highly significant (p-value < 0.001). 

Figure 4: Observed and predicted values for the mean and Gini coefficient of U.S. household net wealth, 1984-2009. Panels: (a) Mean; (b) Gini. The vertical 
bars denote the symmetric 95% normal-approximation confidence intervals for the empirical values calculated via the bootstrap 
resampling method based on 100 replications. Percent error is calculated as follows: $\text{Percent error} = \frac{|\text{predicted} - \text{observed}|}{\text{observed}} \times 100$. 

6.2. Estimation and comparison of finite mixture models for net wealth distribution 

Tables [2] and [3] present the parameter estimates and other relevant statistics arising from the fitting 
of the net wealth distribution models previously discussed to the PSID data from 1984 to 2009. The 
parameters were estimated in all cases by minimizing the negative of the log-likelihood function via a modified 
Newton-Raphson procedure implemented in Stata's ml command [48], with the parameter covariance 
matrix estimates based on the negative inverse Hessian. Convergence was achieved easily within several 
iterations. 
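The estimation logic can be illustrated on a deliberately simple case. The sketch below is not Stata's ml routine and does not fit the mixture models: it assumes a one-parameter exponential model as a stand-in, applies plain (unmodified) Newton-Raphson steps to the negative log-likelihood, and takes the standard error from the inverse Hessian of that function, which is the same quantity as the negative inverse Hessian of the log-likelihood. The function name is hypothetical.

```python
import math


def fit_exponential_newton(x, tol=1e-12, max_iter=100):
    """Toy maximum likelihood fit of an exponential rate by Newton-Raphson.

    Negative log-likelihood: nll(rate) = -n*log(rate) + rate*sum(x).
    Returns (rate estimate, standard error).
    """
    n, s = len(x), sum(x)
    rate = 1.0 / max(x)                  # start below the optimum n / s
    for _ in range(max_iter):
        grad = -n / rate + s             # first derivative of nll
        hess = n / rate ** 2             # second derivative of nll
        step = grad / hess
        rate -= step                     # Newton-Raphson update
        if abs(step) < tol:
            break
    variance = 1.0 / (n / rate ** 2)     # inverse Hessian at the optimum
    return rate, math.sqrt(variance)
```

For the seven-parameter mixtures the same idea applies with a gradient vector and a Hessian matrix in place of the scalars used here.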

The small values of the estimated standard errors indicate that all the parameters were very precisely estimated. The 
mixture proportions (the θ's) correspond exactly to the sample estimates shown in Table [1], and the scale 
parameters (the b's, β and λ) reflect the changes over the period in both the median and the mean 
among the positive and negative values of real net wealth.¹³ The other parameters (the a's, α, p, κ, 
q and s), characterizing distributional shape, are easiest to interpret by comparing predicted values for 
key distributional summary measures with their sample counterparts, as the effect of changing one of 
them is contingent on the value of the other parameters. For example, Figure [4] shows that the overall 
mean net wealth and Gini coefficient as estimated from the mixture models are very close to their sample 
estimates.¹⁴ However, the agreement (both in magnitude and temporal behavior) between the implied and 
sample estimates of the mean and Gini coefficient is much closer for the Singh-Maddala and κ-generalized 
mixture models than for the Dagum one. The mean and Gini coefficient associated with the latter model 
are in fact above the 95% upper confidence limit of their corresponding sample estimates in six (from 1989 
to 2005) and three (1994, 2003 and 2005) cases out of 9, respectively, and their percent error turns out to be 
relatively large compared to the other models, save for 1994, where both the mean and Gini predictions 
exhibited the lowest error, and 2009 with respect to the Gini coefficient implied by the Singh-Maddala 



13 The correlation coefficients between the Weibull scale parameters (λ) and the two series of the median and mean net 
wealth values among the negatives are close to unity (0.982 and 0.955, respectively) and highly significant (p-value < 0.001 in 
both cases). Similarly, the correlation coefficients between the values of the scale parameter of the Singh-Maddala (b), Dagum 
type I (b) and κ-generalized (β) distributions and the two series of the median and mean net wealth levels among the positives 
are all significant at the 1% level and equal, respectively, to 0.931, 0.983 and 0.998 for the median and 0.812, 0.925 
and 0.935 for the mean. 

14 The analytic values for the mean and Gini coefficients, also reported in the last two columns of Tables [2] and [3], were 
obtained by substituting the estimated parameters into the relevant expressions given by Eqs. (16) and (17) with r = 1 for 
the mean and Eqs. (22) for the Gini. 

Table 2: Estimated mixture models for the U.S. household net wealth, 1984-2001 (a) 

Parameters (b) 

Wave  Model   a (α)     b (β)        p (κ)     q         θ1        θ2        s         λ
1984  κ-gen   0.718     76,514       0.374     -         0.068     0.043     0.578     4,511
              (0.009)   (1,663)      (0.022)   -         (0.003)   (0.002)   (0.018)   (382)
      SM      0.757     373,565      -         3.754     0.068     0.043     0.578     4,511
              (0.012)   (56,465)     -         (0.316)   (0.003)   (0.002)   (0.018)   (382)
      D       1.614     138,163      0.377     -         0.068     0.043     0.578     4,511
              (0.042)   (5,720)      (0.015)   -         (0.003)   (0.002)   (0.018)   (382)
1989  κ-gen   0.702     89,241       0.367     -         0.081     0.046     0.619     6,639
              (0.009)   (1,998)      (0.023)   -         (0.003)   (0.002)   (0.018)   (473)
      SM      0.743     459,094      -         3.814     0.081     0.046     0.619     6,639
              (0.012)   (76,568)     -         (0.348)   (0.003)   (0.002)   (0.018)   (473)
      D       1.520     152,781      0.402     -         0.081     0.046     0.619     6,639
              (0.040)   (7,050)      (0.017)   -         (0.003)   (0.002)   (0.018)   (473)
1994  κ-gen   0.727     95,745       0.379     -         0.086     0.048     0.716     10,759
              (0.010)   (2,083)      (0.024)   -         (0.003)   (0.002)   (0.020)   (629)
      SM      0.769     449,157      -         3.713     0.086     0.048     0.716     10,759
              (0.012)   (72,386)     -         (0.338)   (0.003)   (0.002)   (0.020)   (629)
      D       1.528     154,125      0.421     -         0.086     0.048     0.716     10,759
              (0.038)   (6,831)      (0.017)   -         (0.003)   (0.002)   (0.020)   (629)
1999  κ-gen   0.678     107,494      0.389     -         0.094     0.036     0.751     11,529
              (0.009)   (2,579)      (0.024)   -         (0.004)   (0.002)   (0.022)   (642)
      SM      0.724     477,324      -         3.380     0.094     0.036     0.751     11,529
              (0.012)   (77,394)     -         (0.288)   (0.004)   (0.002)   (0.022)   (642)
      D       1.422     181,486      0.420     -         0.094     0.036     0.751     11,529
              (0.038)   (9,213)      (0.018)   -         (0.004)   (0.002)   (0.022)   (642)
2001  κ-gen   0.652     118,900      0.326     -         0.094     0.034     0.724     11,083
              (0.008)   (2,778)      (0.023)   -         (0.003)   (0.002)   (0.020)   (623)
      SM      0.683     980,104      -         4.669     0.094     0.034     0.724     11,083
              (0.010)   (197,710)    -         (0.486)   (0.003)   (0.002)   (0.020)   (623)
      D       1.514     229,564      0.366     -         0.094     0.034     0.724     11,083
              (0.041)   (10,520)     (0.015)   -         (0.003)   (0.002)   (0.020)   (623)

Fit statistics and implied summary measures 

Wave  Model   logLik     AIC       BIC       Mean (c)      Gini (d)
1984  κ-gen   -84,229    168,471   168,539   114,181       0.741
      SM      -84,249    168,511   168,579   111,616       0.736
      D       -84,230    168,474   168,542   121,361       0.753
1989  κ-gen   -87,565    175,143   175,212   133,595       0.754
      SM      -87,573    175,161   175,230   130,298       0.749
      D       -87,583    175,180   175,249   152,050       0.781
1994  κ-gen   -91,861    183,736   183,807   137,029       0.751
      SM      -91,866    183,745   183,816   133,116       0.745
      D       -91,879    183,771   183,842   156,209       0.779
1999  κ-gen   -86,527    173,067   173,137   179,416       0.781
      SM      -86,534    173,082   173,152   [illegible]   0.775
      D       -86,548    173,111   173,181   212,146       0.812
2001  κ-gen   -91,354    182,722   182,792   185,349       0.769
      SM      -91,364    182,742   182,812   181,487       0.765
      D       -91,373    182,759   182,829   211,855       0.794

Notes: (a) κ-gen = κ-generalized mixture model; SM = Singh-Maddala mixture model; D = Dagum mixture 
model. (b) Numbers in round brackets: estimated standard errors. (c) Analytic values obtained by 
substituting the estimated parameters into Eqs. (16) and (17) with r = 1. (d) Analytic values obtained by 
substituting the estimated parameters into Eqs. (22). 

Source: Authors' own calculations using the PSID supplemental wealth files. 



Table 3: Estimated mixture models for the U.S. household net wealth, 2003-2009 (a) 

Parameters (b) 

Wave  Model   a (α)     b (β)        p (κ)     q         θ1        θ2        s         λ
2003  κ-gen   0.665     120,624      0.370     -         0.094     0.039     0.682     13,602
              (0.008)   (2,775)      (0.022)   -         (0.003)   (0.002)   (0.019)   (791)
      SM      0.703     667,612      -         3.767     0.094     0.039     0.682     13,602
              (0.011)   (112,537)    -         (0.329)   (0.003)   (0.002)   (0.019)   (791)
      D       1.443     214,742      0.400     -         0.094     0.039     0.682     13,602
              (0.037)   (10,101)     (0.016)   -         (0.003)   (0.002)   (0.019)   (791)
2005  κ-gen   0.632     138,871      0.315     -         0.102     0.037     0.728     11,804
              (0.008)   (3,206)      (0.022)   -         (0.003)   (0.002)   (0.019)   (600)
      SM      0.660     1,371,330    -         4.978     0.102     0.037     0.728     11,804
              (0.010)   (290,880)    -         (0.531)   (0.003)   (0.002)   (0.019)   (600)
      D       1.456     267,209      0.371     -         0.102     0.037     0.728     11,804
              (0.037)   (12,103)     (0.015)   -         (0.003)   (0.002)   (0.019)   (600)
2007  κ-gen   0.617     150,343      0.307     -         0.111     0.037     0.670     13,715
              (0.007)   (3,469)      (0.021)   -         (0.003)   (0.002)   (0.015)   (713)
      SM      0.645     1,568,538    -         4.995     0.111     0.037     0.670     13,715
              (0.009)   (324,452)    -         (0.508)   (0.003)   (0.002)   (0.015)   (713)
      D       1.450     303,012      0.360     -         0.111     0.037     0.670     13,715
              (0.037)   (13,682)     (0.014)   -         (0.003)   (0.002)   (0.015)   (713)
2009  κ-gen   0.605     128,792      0.353     -         0.144     0.045     0.707     17,847
              (0.007)   (3,120)      (0.022)   -         (0.004)   (0.002)   (0.014)   (756)
      SM      0.640     930,909      -         3.988     0.144     0.045     0.707     17,847
              (0.009)   (171,313)    -         (0.348)   (0.004)   (0.002)   (0.014)   (756)
      D       1.334     245,091      0.393     -         0.144     0.045     0.707     17,847
              (0.033)   (12,039)     (0.015)   -         (0.004)   (0.002)   (0.014)   (756)

Fit statistics and implied summary measures 

Wave  Model   logLik     AIC       BIC       Mean (c)   Gini (d)
2003  κ-gen   -96,140    192,294   192,364   198,801    0.784
      SM      -96,151    192,315   192,385   192,121    0.777
      D       -96,158    192,329   192,399   232,063    0.812
2005  κ-gen   -102,462   204,937   205,008   221,517    0.780
      SM      -102,470   204,954   205,024   216,342    0.775
      D       -102,482   204,978   205,048   264,727    0.813
2007  κ-gen   -106,810   213,633   213,704   244,070    0.791
      SM      -106,821   213,657   213,728   239,641    0.787
      D       -106,833   213,679   213,750   291,719    0.821
2009  κ-gen   -110,583   221,181   221,251   230,967    0.835
      SM      -110,594   221,202   221,273   221,273    0.828
      D       -110,605   221,225   221,295   295,004    0.869

Notes: (a) κ-gen = κ-generalized mixture model; SM = Singh-Maddala mixture model; D = Dagum mixture model. (b) Numbers 
in round brackets: estimated standard errors. (c) Analytic values obtained by substituting the estimated parameters into 
Eqs. (16) and (17) with r = 1. (d) Analytic values obtained by substituting the estimated parameters into Eqs. (22). 

Source: Authors' own calculations using the PSID supplemental wealth files. 




Figure 5: Observed and calculated Lorenz curves for the U.S. household net wealth in 2003 



mixture model. Overall, judging by the percent error of the mean and Gini coefficient of the net wealth 
distributions estimated from the three mixture models, the performance of the κ-generalized mixture model 
is appreciably superior to that of the others over most of the time span investigated. 

The parameter estimates reported in Tables [2] and [3] were also used to build estimated Lorenz curves 
by applying Eqs. (19). The curves for 2003 are presented in Figure [5] together with the empirical Lorenz 
curve estimate. Although it is small, a difference between the three predictions is visible, in that the 
Lorenz curve estimated from the Dagum mixture model lies below the empirical one for approximately 
the top 30% of the wealthiest households, while the Singh-Maddala and κ-generalized mixture models lead 
to estimated Lorenz curves exhibiting a degree of inequality that is much more in line with the observed 
one. In particular, the mean absolute difference between the empirical Lorenz data and the predicted 
values (averaged over all the survey years) amounts to 0.004, 0.007 and 0.002, respectively, for the Dagum, 
Singh-Maddala and κ-generalized mixture models, thus indicating once again that the latter model gives 
a better match to the observed data than the other two. 

It is interesting to note that the κ-generalized mixture model provides a better fit to most of the 
data than any of the alternative models regardless of the criterion used for comparison. For instance, by 
inspection of the AIC and BIC values reported in the fourth- and third-last columns of Tables [2] and [3], it 
emerges that both selection criteria agree on the κ-generalized mixture model as the preferred one for 
all of the survey waves.¹⁵ To see if these differences in the performance of the alternative specifications 
are statistically significant, we adopt the Vuong approach to model selection [51]. This approach sets the 
model selection criterion in a hypothesis testing framework. More specifically, it tests the null hypothesis 
that the models under consideration are equidistant from an unknown "true" model against the alternative 
hypothesis that one model is closer. The test statistic is asymptotically normal under the null hypothesis 
and is quite straightforward to compute. Table [4] shows the results of the comparisons for the three mixture 
models. As can be seen, if one takes 5% as the relevant significance level, only in three cases (i.e. when 
comparing to the Singh-Maddala mixture model in the survey years 1994 and 1999 and to the Dagum 
one in 1984) does the test conclude that the κ-generalized mixture model is observationally equivalent to its 
competitors, while in all the other cases (more than 83% of all cases) its superiority as a descriptive model 
is found to be statistically significant. 

Table 4: Vuong test for model selection, 1984-2009 (a) 

Wave    SM vs κ-gen              D vs κ-gen
        Statistic   p-value      Statistic   p-value
1984    -3.917      9e-05*       -0.159      0.874
1989    -2.085      0.037*       -2.771      0.006*
1994    -1.176      0.240        -2.966      0.003*
1999    -1.741      0.082        -3.674      2e-04*
2001    -2.665      0.008*       -2.481      0.013*
2003    -2.486      0.013*       -2.529      0.011*
2005    -2.121      0.034*       -2.336      0.019*
2007    -2.434      0.015*       -2.462      0.014*
2009    -1.964      0.050*       -2.675      0.007*

Notes: (a) κ-gen = κ-generalized mixture model; SM = Singh-Maddala mixture model; D = Dagum mixture model. The 
null hypothesis is that the competing models are equally close to the "true" data generating process. * Denotes 5% statistical 
significance. 

Source: Authors' own calculations using the PSID supplemental wealth files. 
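The Vuong comparison can be sketched in a few lines. The code below is an illustration under stated assumptions rather than the authors' implementation: it takes per-observation log-densities from two fitted models with the same number of parameters (so no complexity correction is applied), ignores sampling weights, and uses hypothetical function names. A negative statistic favors the second model, which matches the sign convention of the "SM vs κ-gen" and "D vs κ-gen" columns if the κ-generalized model is passed second.

```python
import math


def two_sided_p(z):
    """Two-sided p-value of a standard normal statistic: 2 * (1 - Phi(|z|))."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def vuong_test(loglik_1, loglik_2):
    """Vuong z statistic and two-sided p-value from pointwise log-densities."""
    m = [a - b for a, b in zip(loglik_1, loglik_2)]   # pointwise log ratios
    n = len(m)
    mean = sum(m) / n
    var = sum((d - mean) ** 2 for d in m) / n         # variance of the ratios
    z = math.sqrt(n) * mean / math.sqrt(var)
    return z, two_sided_p(z)
```

As a consistency check on the normal approximation, `two_sided_p(-2.486)` rounds to 0.013, which agrees with the 2003 entry reported for the Singh-Maddala comparison.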

The above evidence is corroborated by a further check involving goodness of fit indicators such as the root 
mean squared error, defined as the square root of the average squared error between the observed and 
predicted values of the cumulative distribution function. In mathematical terms this is expressed as 

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left[ F^{*}(w_i) - F_N(w_i) \right]^2}, \qquad (23)$$

where $F^{*}(w)$ is the distribution function deduced from the fitted mixture models and $F_N(w) = 
\sum_{i=1}^{N} \pi_i I_A(w) / \sum_{i=1}^{N} \pi_i$ denotes the empirical distribution function of the N sample data ordered from 
lowest to highest carrying the $\pi_i$ along ($I_A$ is the indicator function of the set $A = \{w \mid w_i \le w\}$ and $\pi_i$ 
refers to the sampling weight of the ith observation). Clearly, lower values of RMSE indicate a better fit. 
The comparison results between the competing models based on the above criterion are shown in Table [5]. 
As can be seen, the κ-generalized mixture model of net wealth ranks first for all years but 1984, where it 
is outperformed by the Dagum mixture model. 
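A minimal sketch of Eq. (23) follows. It assumes that observations are sorted by wealth with their weights carried along, that tied wealth values are not treated specially, and that `fitted_cdf` is any callable returning the model distribution function; the function name is introduced here for illustration only.

```python
import math


def weighted_ecdf_rmse(wealth, weights, fitted_cdf):
    """RMSE between a fitted CDF and the weighted empirical CDF."""
    pairs = sorted(zip(wealth, weights))      # order from lowest to highest
    total = sum(w for _, w in pairs)          # sum of sampling weights
    cumulative = 0.0
    squared_error = 0.0
    for w_i, pi_i in pairs:
        cumulative += pi_i
        f_n = cumulative / total              # weighted ECDF at w_i
        squared_error += (fitted_cdf(w_i) - f_n) ** 2
    return math.sqrt(squared_error / len(pairs))
```

With the fitted mixture distribution function supplied as `fitted_cdf`, one call per model and wave would produce figures of the kind compared in Table [5], up to the scaling applied there.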

Similar results are obtained by additionally performing an Anderson-Darling goodness of fit test of the hypothesis that 
the data come from the fitted Singh-Maddala, Dagum or κ-generalized mixture model. This test is known 
to be more powerful than other tests based on the empirical distribution function, since it provides equal 
sensitivity at the tails as at the median of the distribution.¹⁶ The last three columns of Table [5] 



15 Model selection criteria such as the Akaike [3] and Bayesian [50] information criteria (AIC and BIC) will select, when 
comparing models with the same number of parameters, the model with the largest log-likelihood value, since both criteria 
are computed according to the formula (−2 × logLik) + (d × npar) and the model with the smallest value is preferred. Here 
npar represents the number of parameters in the fitted model, and d = 2 for the 
usual AIC or d = ln N (N being the number of observations) for the so-called BIC. 
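The formula in footnote 15 can be written out directly. The sketch below uses hypothetical function names; the parameter count of seven used in the usage note is an inference made here from the reported values, not a figure stated in this section.

```python
import math


def information_criterion(loglik, npar, d):
    """Generic criterion (-2 * logLik) + (d * npar); smaller is better."""
    return -2.0 * loglik + d * npar


def aic(loglik, npar):
    """Akaike information criterion (d = 2)."""
    return information_criterion(loglik, npar, 2.0)


def bic(loglik, npar, n_obs):
    """Bayesian information criterion (d = ln N)."""
    return information_criterion(loglik, npar, math.log(n_obs))
```

For the 2003 κ-generalized mixture, `aic(-96140, 7)` gives 192,294, which coincides with the AIC entry of Table [3] under the assumed count of seven parameters.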

16 The formula used for the test statistic is the one reported by [53], which allows for weighted observations. Since the 

Table 5: Goodness of fit comparisons for estimated mixture models of U.S. household net wealth, 1984-2009 

Wave  Model   RMSE (×10^[?])   Rank   A² (×10^[?])   p-value (a)   Rank
1984  κ-gen   1.022            2      0.063          0.673         2
      SM      1.192            3      0.088          0.594         3
      D       0.986            1      0.054          0.743         1
1989  κ-gen   0.911            1      0.045          0.693         1
      SM      0.981            2      0.057          0.644         2
      D       1.118            3      0.064          0.594         3
1994  κ-gen   0.936            1      0.042          0.812         1
      SM      0.997            2      0.049          0.703         2
      D       1.080            3      0.056          0.693         3
1999  κ-gen   0.812            1      0.036          0.782         1
      SM      0.924            2      0.049          0.713         2
      D       1.058            3      0.055          0.614         3
2001  κ-gen   0.798            1      0.038          0.782         1
      SM      0.916            2      0.053          0.663         3
      D       1.008            3      0.050          0.733         2
2003  κ-gen   0.716            1      0.035          0.812         1
      SM      0.823            2      0.047          0.703         2
      D       0.947            3      0.050          0.673         3
2005  κ-gen   0.639            1      0.026          0.802         1
      SM      0.740            2      0.035          0.703         2
      D       0.882            3      0.041          0.653         3
2007  κ-gen   0.747            1      0.034          0.822         1
      SM      0.871            2      0.046          0.624         2
      D       0.923            3      0.047          0.713         3
2009  κ-gen   0.822            1      0.035          0.792         1
      SM      0.907            2      0.046          0.634         2
      D       0.992            3      0.048          0.663         3

Notes: (a) Upper-tail p-value obtained by 100 bootstrap replications. The null hypothesis is that data come from the fitted 
κ-generalized (κ-gen), Singh-Maddala (SM) or Dagum (D) mixture model. 
Source: Authors' own calculations using the PSID supplemental wealth files. 



report the test results for the nine sets of data. P-values are always larger than 0.05, meaning that (if one 
takes 5% as the relevant significance level) in all cases the data can be statistically described by the three 
models. However, except for 1984, fitting the κ-generalized mixture model results both in lower values 
of the test statistic and higher p-values, thus offering superior performance over the Singh-Maddala and 
Dagum mixture models. 
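The bootstrap procedure behind these p-values (resample, refit, recompute the statistic, count exceedances) can be sketched as follows. This is an illustration only: it uses the standard unweighted Anderson-Darling statistic rather than the weighted version of [53], and an exponential model fitted by its closed-form maximum likelihood estimate stands in for the mixture models. All function names are hypothetical.

```python
import math
import random


def anderson_darling(sample, cdf):
    """Unweighted Anderson-Darling statistic A^2 of `sample` against `cdf`."""
    x = sorted(sample)
    n = len(x)
    eps = 1e-12                                   # keep logarithms finite
    u = [min(max(cdf(v), eps), 1.0 - eps) for v in x]
    s = sum((2 * i + 1) * (math.log(u[i]) + math.log(1.0 - u[n - 1 - i]))
            for i in range(n))
    return -n - s / n


def fit_exponential(sample):
    """Stand-in model: exponential CDF with the ML rate estimate."""
    rate = len(sample) / sum(sample)
    return lambda v: 1.0 - math.exp(-rate * v)


def bootstrap_pvalue(data, fit, n_boot=100, seed=0):
    """Fraction of refitted resamples whose A^2 exceeds the observed one."""
    rng = random.Random(seed)
    observed = anderson_darling(data, fit(data))
    exceed = 0
    for _ in range(n_boot):
        resample = [rng.choice(data) for _ in data]     # draw with replacement
        stat = anderson_darling(resample, fit(resample))  # refit each time
        if stat > observed:
            exceed += 1
    return observed, exceed / n_boot
```

Refitting the model on every resample is the essential step: it accounts for the fact that the reference distribution is itself estimated from the data.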

Can these findings be ultimately ascribed to the different performance of the alternative densities used 
to characterize positive net wealth values? Figure [6] presents for the 2003 PSID wave the relationship 
between log-rank and log-size along the positive support of the net wealth distribution. This double- 
logarithmic framework, known as the Zipf plot, is natural to use when focusing on the top part of the 
distribution because it accentuates the upper tail, making it easier to detect deviations in that part of the 



distribution of the Anderson-Darling test statistic is only known for data sets truly drawn from any given distribution [54], 
while in our case the underlying distribution is itself determined by fitting to the data and hence varies from one data set 
to the next, the p-values for the test have been derived by making use of a nonparametric bootstrap method [55]. That is, 
given our N-vector of net wealth data, we generated 100 synthetic data sets by drawing new sequences of N observations 
uniformly at random from the original data. We then fitted each synthetic data set individually to the three mixture models 
and calculated the test statistics for each one relative to its own fitted model. Then we simply counted what fraction of the time 
each resulting statistic was larger than the value for the empirical data. This fraction is the p-value for each fit, and can be 
interpreted in the standard way: if it is larger than the chosen significance level, then the difference between the empirical 
data and the model can be attributed to statistical fluctuations alone; if it is smaller, the model is not a plausible fit to the 
data. 

Figure 6: Zipf plot for the positive values of U.S. household net wealth in 2003 

distribution from the theoretical prediction of a particular model.¹⁷ The lines show the predicted Zipf plots 
obtained from the fit of the models considered. As the figure reveals, all of them are in good agreement 
with the actual data in the low-middle range of the positive support of net wealth distribution. However, 
at the top tail there is a systematic departure of empirical observations from the theoretical predictions of 
the mixtures using the Singh-Maddala and Dagum type I specifications as descriptions of the positive net 
wealth values, while in the same part of the distributions the theoretical Zipf plot for the κ-generalized 
mixture model lies much closer to the empirical one. This point is of particular relevance in the current 
context, both for the documented presence of long and fat tails towards the upper end of the U.S. net 
wealth distribution and the fact that the upper tail of the three densities accounting for the positive values 
of net wealth is heavy in that it decays like a power function as wealth increases.¹⁸ 
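The coordinates of a Zipf plot are simple to construct. The sketch below is an unweighted illustration with hypothetical function names: it ranks the positive observations from largest to smallest and returns (log-size, log-rank) pairs, and adds an ordinary least squares slope over the k largest observations as a rough indicator of power-law decay. It is not the estimator used in the paper.

```python
import math


def zipf_points(wealth):
    """(log size, log rank) pairs for the positive values, largest first."""
    positive = sorted((w for w in wealth if w > 0), reverse=True)
    return [(math.log(w), math.log(rank))
            for rank, w in enumerate(positive, start=1)]


def tail_slope(points, k):
    """OLS slope of log-rank on log-size over the k largest observations."""
    xs, ys = zip(*points[:k])
    mean_x, mean_y = sum(xs) / k, sum(ys) / k
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx
```

For data whose upper tail follows a power law, the points fall on a straight line in this representation, so curvature at the top end signals a departure of the kind described above.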

7. Summary and conclusions 

This paper mainly deals with the specification, analysis and application of models for net wealth 
distribution with support in the interval (−∞, ∞). These are mixtures (or, equivalently, convex representations) 
of three distributions with non-overlapping intervals, which have the advantage of providing 
a relatively flexible functional form while retaining the advantages of parametric forms that 


17 For an illustration of basic properties of the Zipf plot see e.g. [56]. 

18 See [lj, [25], [26] on the upper tail behavior of the κ-generalized. For the other distributions, see [57, 58] and 

are amenable to inference. The first distribution is a two-parameter Weibull model that describes the 
distribution of economic units with negative net wealth, i.e. covering the open interval (−∞, 0); the second 
is a degenerate distribution with its unit mass concentrated at w = 0; and the third is, alternatively, the 
three-parameter Singh-Maddala, Dagum type I or κ-generalized model that accounts for the distribution 
of economic units with positive net wealth, hence defined in the open interval (0, ∞). 
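The three-component structure just described can be sketched as a single distribution function. Several points are assumptions made for this illustration and are not confirmed by this section: θ1 and θ2 are taken to be the shares of negative and zero net wealth, s and λ the Weibull shape and scale applied to the absolute value of negative wealth, and the κ-generalized distribution function is written in its usual form 1 − exp_κ(−(x/β)^α), with exp_κ(u) = (√(1 + κ²u²) + κu)^(1/κ). Function names are hypothetical.

```python
import math


def kappa_exp(u, kappa):
    """kappa-exponential; uses exp_k(-u) = 1 / exp_k(u) for stability."""
    base = math.sqrt(1.0 + (kappa * u) ** 2) + kappa * abs(u)
    value = base ** (1.0 / kappa)
    return value if u >= 0 else 1.0 / value


def kappa_gen_cdf(x, alpha, beta, kappa):
    """Assumed kappa-generalized CDF for x > 0."""
    return 1.0 - kappa_exp(-((x / beta) ** alpha), kappa)


def mixture_cdf(w, theta1, theta2, s, lam, positive_cdf):
    """CDF of the net wealth mixture on the whole real line."""
    if w < 0:
        # Weibull component on |w|: mass theta1 spread over (-inf, 0)
        return theta1 * math.exp(-((-w / lam) ** s))
    if w == 0:
        # Point mass theta2 at zero
        return theta1 + theta2
    # Positive component carries the remaining mass
    return theta1 + theta2 + (1.0 - theta1 - theta2) * positive_cdf(w)
```

Swapping `positive_cdf` for a Singh-Maddala or Dagum type I distribution function would give the other two mixtures without touching the negative and zero components.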

We have obtained closed formulas for the different probability functions, moments and standard tools 
for inequality measurement (i.e. the Lorenz curve and Gini concentration ratio). Except for the Dagum 
general model of net wealth [17-20], to the best of our knowledge this is the first time that the analytical 
properties of finite mixture models for net wealth based on alternative distributions to characterize positive 
values are fully derived. 

The performance of the three mixture models has been checked against real data on U.S. household 
net wealth for different years. Goodness-of-fit comparisons reveal that all three models are in good 
agreement with actual data, but the departure of empirical observations from the predictions of the Singh- 
Maddala and Dagum mixture models is almost always larger than in the case of the κ-generalized. In particular, 
the latter model provides a superior fit in the right tail of the data with respect to the others in many instances. 

Finite mixture models deserve further attention in the future. A feature of these models is that each of the 
parameters may be made a function of covariates summarizing household characteristics. Estimation of 
"heterogeneous" wealth distributions such as these, with distributional shape allowed to vary with personal 
characteristics, provides a route to decomposition analysis of the sources of differences in wealth inequality 
across years or countries. This could be a good starting point for future research. 